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ADAPTIVE STRATIFIED MONTE CARLO ALGORITHM FOR NUMERICAL 

COMPUTATION OF INTEGRALS 

TONI SAYAH t 


Abstract. In this paper, we aim to compute numerical approximation integral by using an adaptive 
Monte Carlo algorithm. We propose a stratified sampling algorithm based on an iterative method which 
splits the strata following some quantities called indicators which indicate where the variance takes 
relative big values. The stratification method is based on the optimal allocation strategy in order to 
decrease the variance from iteration to another. Numerical experiments show and confirm the efficiency 
of our algorithm. 
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1. Introduction 

This paper deals with adaptive Monte Carlo method (AMC) to approximate the integral of a given func¬ 
tion / on the hypercube [0, G IN*. The main idea is to guide the random points in the domain 

in order to decrease the variance and to get better results. The corresponding algorithm couples two 
methods: the optimal allocation strategy and the adaptive stratified sampling. In fact, it proposes to 
split the domain into separate regions (called mesh) and to use an iterative algorithm which calculates 
the number of samples in every region by using the optimal allocation strategy and then refines the parts 
of the mesh following some quantities called indicators which indicate where the variance takes a relative 
big values. 

A usual technique for reducing the mean squared error of a Monte-Carlo estimate is the so-called strati¬ 
fied Monte Carlo sampling, which considers sampling into a set of strata, or regions of the domain, that 
form a partition (a stratification) of the domain (see [5] and the references therein for a presentation 
more detailed). It is efficient to stratify the domain, since when allocating to each stratum a number of 
samples proportional to its measure, the mean squared error of the resulting estimate is always smaller 
or equal to the one of the crude Monte-Carlo estimate. For a given partition of the domain and a fixed 
total number of random points, the choice of the number of samples in each stratum is very important 
for the results and precision. The optimal allocation strategy (see for instance [3] or [T]) allows to get the 
better distribution of the samples in the set of strata in order to minimize the variance. We give in the 
next section a brief summary of the this strategy which will be the basic tools of our adaptive algorithm. 

In the other hand, it is important to stratify the domain in connection with the function / to be inte¬ 
grated and to allocate more strata in the region where / has larger local variations. Many research works 
propose multiple methods and technics to stratify the domain: [T] for the adaptive stratified sampling for 
Monte-Carlo integration of differentiable functions, [5] for the adaptive integration and approximation 
over hyper-rectangular regions with applications to basket option pricing, |3], .... 

The paper is organized as follows. Section 2 describes the adaptive method. We begin by giving a 
summarize of the optimal allocation strategy and then describe the adaptive algorithm consisting in 
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stratifying the domain. In section 3, we perform numerical investigations showing the powerful of the 
proposed adaptive algorithm. 

2. Description of the adaptive algorithm 

In this section, we will describe the AMC algorithm which is based on indicators to guide the repartition 
of the random points in the domain. In our algorithm, the indicators are based on an approximation 
of the variance expressed on different regions in the domain. We detect those where the indicators are 
bigger than their mean value up to a constant and we split them in small regions. 


2.1. Optimal choice of the numbers of samples. Let D = [0,be the unit hypercube of 11“*, 
d G N*, and / : D —> R a Lebesgue-integrable function. We want to estimate 

^{f) = [ f{x)d\{x), 

J D 

where A is the Lebesgue measure on R*^. 

The classical MC estimator of I(/) is 


N 






where Ui,l < i < N, are independent random variables uniformly distributed over D. Tmc(/) is an 
unbiased estimator of I{f), which means that E[Tmc(/)] = 2i(/). Moreover, if / is square-integrable, the 
variance of iucif) is 


Var(lMc(/)) = 


N 


where 


\f) = f{xfd\{x) - f{x)d\{3 


Variance reduction techniques aim to produce alternative estimators having smaller variance than crude 
MC. Among these techniques, we focus on stratification strategy. The idea is to split D into separate 
regions, take a sample of points from each such region, and combine the results to estimate T(/). Let 
{Di ,..., Dp} be a partition of D. That is a set of sub-domains such that 

p 

D = \^ Di and Di n Dj = 0 for i ^ j. 

i=l 

We consider p corresponding integers ni,... ,np. Here, rii will be the number of samples to draw from 
Di. For 1 < z < p, let Oi = / dX{x) be the measure of Di and Di{f) = / f{x)dX{x) be the integral 

J Di J Di 

p p ^ 

of / over Di. We have X{D) = a* and X{f ) = Furthermore, for 1 < z < p, let tt^ = — 

1-1 

t—1 

be the density function of the uniform distribution over Di and consider a set of rii random variables 
... ,Xn} drawn from tt^. We suppose that the random variables 1 < J < n^, 1 < z < p, are 
mutually independent. 

For 1 < z < p, let Si be the MC estimator of Xi{f) defined by: 


5. = - 

n - ' 


(0 

k • 




Then, the integral T(/) can be estimated by: 


p 

isMcif) = ^ 
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We call IsMcif) the stratified Monte Carlo estimator of I(/). It is easy to show that Ismc(/) is an 
unbiased estimator of I(/) and, if / is square-integrable, the variance of Ismc(/) is 


Var(isMc(/))=E“? 


^?(/) 


where 


^Uf) = J f{xfdTT^{x) - (^J f{x)dTri{x)j , \/l<i<p. 


The choice of the integers Ui, i = 1,... ,p is crucial in order to reduce Var(IsMc(/))- A frequently made 
choice is proportional allocation which takes the number Ui of points in each sub-domain Di proportional 

p 

to its measure. In other words, if then Ui = Noi, i = 1,... ,p. 

i=l 

For this choice, we have 

Var(iMc(/)) = Var(XsMc(/)) + 4 E 


Oi 


and hence, Var(IsMc(/)) < Var(lMc(/))- 

To get an even smaller variance, one can consider The optimal allocation which aims to minimize 


V{ni,...,np) 


Hi 


as a function of ni,..., Up, with N ~ Let 




N 

Using the inequality of Cauchy-Schwarz, we have 

faiuiif) apap{f)\ 1 ^ 


V 


V <5 




N 

1 




N 


\i=l ‘ / i=l 

< U(ni,.. .,np). 


Hence, the optimal choice of ni,..., rip is given by 


azMf) ■ 1 

n, = ---, i = 1,... ,p. 


( 2 . 1 ) 

In order to compute the number of random points in Di using ( |2.1[ ), one can approximate Ui{f) by: 

^Uf) = — 

m 

For 1 < z < p, we will denote 


rii ^ ni 2 

(2.2) 

i=i *i=i 


ni= - 

0 

(2.3) 


(2.4) 


i=l 


where 
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2.2. Description of the algorithm. The adaptive MC scheme aims to guide, for a fixed global number 
N of random points in the domain D, the generation of random points in every sub-domain in order to 
get more precise estimation on the desired integration. It is based on an iterative algorithm where the 
mesh (repartition of the sub-domains in D) evolves with iterations. Let L be the total number of desired 
iterations and D^, 1 < ^ < L, be the corresponding mesh such that 

pt 

= [j dI and dI n = 0 for i ^ j, 

where p£ is the number of the sub-domains in D^. We start the iterations with a subdivision of the 
domain — D using pi identical sub-cubes with given equal numbers of random points in each 

Pi 

sub-domain Dl,i = 1,..., riiq, such that N = 


The main idea of the algorithm consists for every iteration 1 < ^ < L, to refine some region Di, 1 < i < pi, 
of the mesh where the function / presents more singularities (big values of the variance) and hence 
must be better described. This technique is based on some quantities called indicators and denoted Vi^£ 
which give informations about the contribution of Df in the calculation of the variance of the MC method 
at this level, approximated by 

Pi 

= ( 2 - 5 ) 


where 






( 2 . 6 ) 


Our goal is to decrease Vi during the iterations. Then, for every refinement iteration i with a correspond¬ 
ing mesh and corresponding numbers we calculate and 5i, and update fii^i by using the 

optimal choice of the numbers of samples based on the formulas (2.21, (2.41 and (2.3) for all the sub- 
domains i = 1,... ,pi. For technical reason, we allow a minimal number, denoted by M^p (practically 
we choose M^p = 2), of random points in every sub-domain and then if fii^i < M^p we set = M^p. 
Next, we calculate the indicators Vi^i and Vg, and then, we adapt the mesh to obtain the new one 
The chosen strategy of the adaptive method consists to mark the sub-domains Df such that 


T/ \ T/l* 


where Cm is a positive constant bigger than 1 and is the mean value of Vi^i defined as 


T/mean 



(2.7) 


and to divide every marked sub-domains into small parts, four equal sub-squares for d = 2 and eight 
equal sub-cubes for d = 3, with equal number of random points in each part given by 


{ max(^^,Mp,r) for d = 2 

max(Mpr) for d = 3. 
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Remark 2.1. We stop the algorithm if the number of iterations reaches L or if the calculated variance 
is smaller that a tolerance denoted by e. We denote by ig the stopping iteration level of the following 
algorithm which corresponds to a desired tolerance e or at maximum equals to L. 


The algorithm can be described as following : 


(Algo 1) : For a chosen N with corresponding numbers n^^i, and a given initial mesh with corre¬ 
sponding sub-domains Dl,i = 1,... ,pi. 
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Generate i = 1,. . . ,pi random points Xj,j = l,...,ni^i in every sub-domain Dj . 
set £ = 1. 

calculate Ve by using ( [ 2 T 5 I ) . 

While 1<L or Ve < £ 

calculate (Ji,eif) and, <5^ and update ni^e,i = 1, . . . ,pe by using ( |2.2[ ) , 

Generate corresponding random points Xj,ji = 1,... , in each sub-domain Dl,i = 1,. . . ,pe ■ 
calculate Vi,f, i = 1,. . . ,p; and by using ( |2 . 6p and ( |2 . 7| ) . 

for (i = 1 : pe) 

if {Vi,e > Cm 

Divide the sub-domain DI in m small parts (m = 4 in 2D and m = 8 in 3D) . 

Associate to every one of this small parts the number of random points maLx(—^,Mpr). 

m 

set pe = pe+m. 
end if 
end for 
£ = £+1. 
end loop 
£e=£-l. 

fc=i 

The previous algorithm calculate an approximation of !(/) with an adaptive Monte Carlo method. If we 
are interested by the numerical variance, we repeat the previous algorithm times and approximate 
the T(/) by the corresponding mean value 




calculate the adapt MC approximation Xamc = / ^ 

E -—•/ n: 




(2.4) and (2.3). 


'-AMC — 


N,. 


N,ss 

E 

i=l 


where 1\mc corresponds to the essay using (Algo 1). 

The estimated variance will by given by the formula 

VaMC = -f-j ( E (^AMc)"^ ~ Ness^AMC 

■^^ess -L ^ 


In fact, it is useless to repeat the (Algo 1) Ness times to calculate Iamc and VamCj and it is expensive 
for the CPU time. We can reduce the coast by running (Algo 1) one time to define the mesh and to 
get Iamc then, we use the corresponding sub-domains = l,...,p^^ with the corresponding 

number of random points = 1,... ,£e to perform the rest of calculations {Ness — 1 essays). The 

corresponding algorithm can be describe as follow : 


(Algo 2) : 


Call algorithm (Algo 1) to define the mesh DV ,i = 1, . ■ ■ ,Pie and calculate Iamc 
Set tie = 2 
While ne<Ness 

Generate corresponding random points Xj, j = 1,... , in each sub-domain = 1,. ■ ■ ,Pte 


Calculate X 


Ptc 

AMC = E Ef ^ 
i — 1 '‘Ae 


Set Tie = Re + 1 
end loop calculate 

N, 

I 

Xamc = 


X, 


^E^ 

i = l 


AMC ca.lcula.'t© 

iVe, 


VaMC — “ ^ess^AMC^ 


i=l 
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3. Numerical experiments 

In this section, we perform in MATLAB several numerical experiments to validate our approach and we 
compare between the MC and AMC methods. 


3.1. 2D validations. We consider the unit square D = [0,1[^, Cm = 2, Mpr = 2 and = L. The initial 
mesh is constituted by a regular partition with Nq = A segments in every side of = D (see figure [^. 


Figure 1. Initial partition D]C = 1 ; ■ • ■ iPi {Pi = 16) with Nq = 4. 


In this section, we show two particular cases of the function /. The first treats an integrable but not 
continuous function which presents a discontinuity along the border of the unit disc. The second one 
treats a function concentrated in a part of D and vanishes in the rest on this domain. Both examples 
show the powerful of the proposed AMC method. 


3.1.1. First test case. 

For the first test case, we consider the function fc given on D by 

f / I _ / 1 if + 2/2 < 1 

1 0 elsewhere. 


The exact integration of fc over D is equal to 


1 = 


fc{x,y)dxdy 


JD 

which is the quarter of the surface of the unit disc. 


TT 

4’ 


We begin the numerical tests with the algorithm (Alog 1). Figures |2|^ show for N = 10000 and L = 6 
the evolution of the mesh and the repartition of the random points during the iterations. We remark that 
this points are concentrated around the curve = 1 where the function fc represents a singularity. 



Figure 2. Mesh for the second iteration. 



Figure 3. repartition of the 
points (second iteration). 
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Figure 4. Mesh for the fourth iteration. 



Figure 5. repartition of the 
points (fourth iteration). 



Figure 6. Mesh for the sixth iteration. 



points (sixth iteration). 


Figure 1^ shows a comparison in logarithmic scale of the relative errors ~ 


Emc = 


I — Imc 


corresponding to the MC method and 


Eamc = 


I — Iamc 
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corresponding to the AMC method with respect to the number of random points N where the total 
number of the iteration L = 4. As we can see in figure the AMC method is more precise than the 
MC method. Still we have to compare the efficiency of the AMC method with respect to the CPU time 
of computation. In fact, figure shows that for the considered N, the corresponding CPU times for 
the AMC are smaller from those with MC. In particular, the MC method gives for N = 10^ an error of 
Emc = 0.00052 with a CPU time of 0.44s, but the AMC gives for N = 10® an error of Eamc = 0.00008 
with a CPU time of 0.4s. Hence, the powerful of the AMC method. It is also clear that to get more 
precision with the AMC method, we can increase the number of iterations L. 


Next, we consider the algorithm (Algo 2) with N^ss = 100, L = 4. 


Figure 10 shows the comparison of the estimated variance between the classical Monte Carlo (Vmc) 
and adaptive Monte Carlo method (Vamc) in logarithmic scale. As the adaptive algorithm consists to 
minimize the variance, it is clear in this figure that the goal is attended. Figure [D shows in logarithmic 
scale the efficiency of the MC and AMC methods versus the number of random points N by using the 
following formulas (see 0 ) 


Emc * Vmc 
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Figure 8. First test case: 

Emc Eamc with respect 
to N in logarithmic scale. 
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Figure 9. First test case: 

CPU time of the MC and AMC 
methods with respect to N in 
logarithmic scale. 


and 

= _ I _ 

AMC TamC * VamC ’ 

where Tmc and Tamc are respectively the CPU time of the MC and AMC methods. It is clear that the 
efficiency of the AMC method is more important than the MC method. 



Figure 10. First test case: 
Estimated variances Vmc and 
Vamc with respect to N in log¬ 
arithmic scale. 


Figure 11. First test case: Ef¬ 
ficiencies and with 

respect to N in logarithmic 
scale. 


3.1.2. Second test case. 

In this case, we consider the function 

where a is a real positive parameter. We begin the adaptive algorithm with the same initial mesh as the 
previous case and we choose N = 10000. Figures [T2]|T5| show for L — Q the meshes and random points 
repartition with respect to a. When a increase, the mesh and the random points follow the function / 2 ,g 
and focus more and more around the origin of axis. 
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Figure 14. AMC mesh a = -50. 
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X 

Figure 15. AMC mesh a = -100. 


Figure [16] shows for a = —50, Ness = 100 and A = 4, the comparison of the estimated variance between 
MC and AMC methods with respect to N in logarithmic scale. Figure shows in logarithmic scale the 
efficiency of the MC and AMC methods versus the number of points N. One more time, it is clear that 
the efficiency of the AMC method is more important than the MC one. 



Figure 16. Second test case 
{a = —50): Estimated vari¬ 
ances Vmc and Vamc with re¬ 
spect to N in logarithmic scale. 



Figure 17. Second test case 
(a = —50): Efficiencies 
and with respect to N 

in logarithmic scale. 
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3.2. 3D validations. In this section, we consider the unit cube D = [0, Ip. We consider the function 

/3,g(x,2/) = e-“(" +y+^\ 

where a is a real positive parameter. The initial mesh is constituted by a regular partition with Nq = 4 
segments in every side of = D. Figure 18 shows the repartition of the random points for a = —50, 
L = 6 and TV = 10000. 



Figure 18. Mesh Gaussian for a = —50. 

As for the previous case, figure shows for Wss = 100 and L = 4, the comparison of the estimated 
variance between MC and AMC methods with respect to N in logarithmic scale. Figure shows in 
logarithmic scale the efficiency of the MC and AMC methods versus the number of points N. We can 
deduce the same remark for the efficiency of the AMC method in dimension three. 


AMC 


„ amc 





,,.5 


- 




1 




- 
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log10(N) 


loglO(N) 

Figure 19. 3D test case (a = 


Figure 20. 3D test case (a = 


-50): Estimated variances -50): Efficiencies and 

^MC Vamc with respect to with respect to N in log- 

N in logarithmic scale. arithmic scale. 
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